Genome size variation within Crithmum maritimum: Clues on the colonization of insular environments

Abstract
Angiosperms present an astonishing diversity of genome sizes that can vary intra‐ or interspecifically. Remarkable new cytogenomic data have shed some light on our understanding of evolution, but few studies have been performed with insular and mainland populations to test possible correlations with dispersal, speciation, and adaptations to insular environments. Here, patterns of cytogenomic diversity were assessed among geographic samples (ca. 114) of Crithmum maritimum (Apiaceae), collected across the Azores and Madeira archipelagos, as well as in adjacent continental areas of Portugal. Flow cytometry results indicated significant intraspecific genome size variation, ranging from smaller sizes in the insular populations to larger ones in the mainland populations. Moreover, genome size tended to increase along the mainland populations, in association with lower temperatures, higher precipitation, and lower precipitation seasonality. However, this gradient might be the result of historic phylogeographical events associated with previous dispersal and extinction of local populations. Overall, our findings provide evidence that smaller genome sizes might play a critical role in the colonization of islands, corroborating other studies that argue that organisms with smaller genomes use fewer resources and thus have a selective advantage in insular environments. Although further studies are needed to improve our understanding of the mechanisms underlying genome size evolution on islands, conservation strategies must be promoted to protect the rich cytogenomic diversity found among C. maritimum populations, which occur in coastal areas that are particularly threatened by human activity, pollution, invasive species, and climate change.

Polyploidization and, more recently, transposable elements have been recognized as important sources of speciation (Craddock, 2016; Wood et al., 2009). Both phenomena initially lead to an increase in genome size (Pellicer et al., 2018); nevertheless, these processes are reversible, as plants are equipped with mechanisms that lead to genome downsizing (e.g., chromosomal rearrangement, large-scale loss of repetitive sequences and duplicated genes, elimination of transposed copies) (Chen et al., 2007; Freeling et al., 2012; Vitte & Panaud, 2005; Wendel, 2015).
Moreover, genome size (GS) can also have a significant impact at the ecological and evolutionary level (Biémont, 2008), since it correlates not only with several phenotypic traits, such as flowering time, flower size, seed mass, and photosynthetic rate (e.g., Beaulieu, Moles, et al., 2007, 2008; Meagher & Vassiliadis, 2005), but also with environmental variables, namely altitude, latitude, and temperature (e.g., Knight et al., 2005; Suda et al., 2005), and it has been shown to have implications for ecological adaptation in plants (e.g., Ramsey, 2011). In fact, GS is negatively correlated with altitude in the genera Zea (Díez et al., 2013) and Berberis (Bottini et al., 2000).
Although remarkable cytogenomic data are shedding some light on our understanding of evolution, few studies have been performed with plant lineages from the Macaronesian Islands (i.e., Azores, Madeira, Selvagens, Canary Islands, and Cabo Verde), which harbor a rich endemic flora of approximately 900 vascular plant species (Florencio et al., 2021). Since the seminal study of Suda et al. (2003) on the Canary endemic flora, few such studies have been carried out in Macaronesia.
Furthermore, Suda et al. (2003, 2005) found several correlations with environmental variables; for example, the genera Argyranthemum, Micromeria, and Silene presented positive correlations between mean annual temperature and GS and negative correlations between GS and both rainfall and altitude. The opposite trend was observed in the genera Crambe and Sonchus. Brilhante et al. (2021) found negative correlations between annual mean precipitation and GS for Aeonium in Tenerife.
Since GS correlates with phenotypic and ecological traits, it can be inferred that GS is shaped by natural selection; nonetheless, genetic drift has also been proposed to explain GS variation (Oliver et al., 2007; Whitney et al., 2010). To test whether GS undergoes selection, population-level analyses are ideal (Díez et al., 2013).
The sea fennel or rock samphire, Crithmum maritimum L. (Apiaceae), the sole species of a monospecific genus (Castroviejo, 2003; Meot-Duros & Magné, 2009), is a facultative halophyte that grows on rocky sea cliffs and occasionally in sands and gravel (Figure 1). It has a very wide distribution, occurring along the European Atlantic coasts, the Azores, Madeira, and Canary archipelagos, the Mediterranean and Black Sea coasts, and northwest Africa, with its distribution being limited by temperature (Castroviejo, 2003; Crawford, 1982). However, due to climate change, the distribution of C. maritimum is currently expanding northwards (Metzing & Gerlach, 2001). C. maritimum is an aromatic herb with therapeutic properties known since ancient times: it was mentioned by Hippocrates in the 4th century BCE as a remedy to soothe vesical pains (Pline, 1957), and sailors ate its fresh leaves to prevent scurvy (Baytop, 1984). Over the centuries, the use of this plant has decreased, but in the 21st century more studies have been published suggesting its potential as a crop (e.g., Renna, 2018). In fact, its leaves hold high contents of carotenoids, flavonoids, vitamin C, and other bioactive substances (Özcan et al., 2001). Moreover, recent studies identified high levels of both genetic (Latron et al., 2018, 2020) and phytochemical differentiation among C. maritimum populations (Katsouri et al., 2001; Kulišić-Bilušić et al., 2010; Maleš et al., 2001). However, cytogenomically, C. maritimum has been analyzed from only one location (i.e., Strunjan saltpan in Slovenia), resulting in an estimated 2C-value of 4.38 pg (Koce et al., 2008).
FIGURE 1 Plants of Crithmum maritimum (a) growing in sands in Viana do Castelo; (b) growing on coastal rocky cliffs in São Jorge, Madeira; and (c) detail of its inflorescences. (Photos by Guilherme Roxo and Maria Romeiras.)
The present study aimed to investigate the cytogenomic variation of C. maritimum at the population level and whether it correlates with environmental variables. Namely, we intended to determine the amount of GS variation among populations and to test for an association with geographical or environmental variables. Additionally, we discuss our results from an evolutionary and taxonomic point of view, in accordance with the current theories of genome evolution.

| Climatic characterization and biogeographic regions of the study area
The Azorean archipelago belongs to the Atlantic European Province (Costa et al., 1998) and is located in the North Atlantic ridge between the latitudes of 36° 45′N and 39° 43′N and the longitudes of 24° 45′W and 31° 17′W. Its climate is wet, cloudy, and with mild temperatures all year round. This is because from September to March the Azores is crossed by the North Atlantic Storm-track and the rest of the year by the Azores anticyclone.
The Madeira archipelago belongs to the Madeirese Province (Costa et al., 1998) and is situated to the southwest of continental Portugal, between 32° 24′ and 33° 27′N in latitude and 16° 16′ and 17° 16′W in longitude. The climate is influenced by trade winds from the N and NE and by the islands' orography. The northern slopes are more humid and have higher rainfall and less sunlight than the southern slopes. The mean annual air temperature ranges from 8°C on the highest peak to 18-19°C in the coastal region.
Mainland Portugal has a temperate climate with rainy winters and hot, dry summers. Across its territory, nine biogeographic regions can be recognized (Costa et al., 1998), four of which were included in our sampling. The northernmost is the Galician Portuguese Sector, which belongs to the Eurosiberian Region and is characterized by a temperate and rainy climate without a clear dry season. The other three sectors belong to the Andalusian Lusitanian Coastal Province, which is characterized by a mild climate. The Portuguese Divisorian Sector, the northernmost of the three, is mostly located in the lower mesomediterranean level; however, its coastal zones are in the sub-humid upper levels of the thermomediterranean climate. The Ribataganian-Sadese Sector is characterized by a thermomediterranean sub-humid climate and the Algarvese-Monchiquense Sector by a thermomediterranean dry to sub-humid climate.
Table S1). From each population, a minimum of three specimens was collected. For each site, we recorded geographical coordinates and altitude with a GPS device. Each sample was preserved in wet tissue paper, wrapped in aluminum foil, placed in a ziplock bag, kept at 5°C, and posted to the laboratory.

| Cytogenomic analysis
Nuclear DNA content was estimated using flow cytometry (FCM). Suspensions of intact nuclei were prepared following the method of Galbraith et al. (1983). Fresh young leaves were chopped with a razor blade in a Petri dish containing 1 mL of Woody Plant Buffer (WPB: 0.2 M Tris-HCl, 4 mM MgCl2, 1% Triton X-100, 2 mM Na2EDTA, 86 mM NaCl, 20 mM sodium metabisulfite, 1% PVP-10, pH 7.5; Loureiro et al., 2007). The nuclear suspension was sieved through a 30 μm nylon mesh to remove large debris. Then, nuclei were stained with propidium iodide (PI; Sigma-Aldrich, USA; 25 μg mL−1, 50 μL). To estimate the nuclear DNA content, Solanum lycopersicum L. 'Stupické' (2C = 1.96 pg; Doležel et al., 1992) was used as the reference standard. Numeric data and fluorescence graphs were acquired with Sysmex FloMax software v2.4d (Sysmex, Görlitz, Germany), as described by Guilengue et al. (2020). The histograms for each sample were recorded and the C-values were calculated with the following formula:
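With an internal reference standard, the 2C-value of a sample is conventionally obtained as the ratio of the sample and standard G0/G1 peak means multiplied by the known 2C-value of the standard. The sketch below assumes that this conventional relation is the one intended here; the function name and the peak positions are hypothetical, and only the 1.96 pg value of the tomato standard comes from the text.

```python
def estimate_2c(sample_peak_mean, standard_peak_mean, standard_2c_pg=1.96):
    """Return the sample 2C-value (pg) from G0/G1 fluorescence peak means.

    Assumed relation (conventional internal-standard calculation):
        sample 2C = (sample peak mean / standard peak mean) * standard 2C
    The default of 1.96 pg is the value quoted for S. lycopersicum 'Stupické'.
    """
    if standard_peak_mean <= 0:
        raise ValueError("standard peak mean must be positive")
    return sample_peak_mean / standard_peak_mean * standard_2c_pg


# Hypothetical peak positions in arbitrary fluorescence channel units.
sample_2c = estimate_2c(sample_peak_mean=440.0, standard_peak_mean=200.0)
print(f"Estimated 2C-value: {sample_2c:.3f} pg")
```

Because only the ratio of the two peak means enters the calculation, the result does not depend on the absolute fluorescence scale of the instrument.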

| Statistical analysis
Statistical analyses and descriptive statistics were performed using R v4.2.21 software (R Core Team, 2020). We followed the same general approach already outlined by our team in a previous paper, which is summarized below.

| Basic statistics
(iii) among mainland biogeographic regions and the two archipelagos. Box-Cox or other conventional transformation techniques (Box & Cox, 1964; Zar, 2010) did not normalize our data (p < .05 with the Shapiro-Wilk test) (Shapiro & Wilk, 1965). Thus, group comparisons were carried out with nonparametric tests. The Mann-Whitney and Kruskal-Wallis tests were performed for comparisons between two groups or more than two groups, respectively; when the null hypothesis was rejected, the latter was followed by a nonparametric multiple comparison test (Conover & Iman, 1979; Siegel & Castellan, 1988), using the function posthoc.kruskal.conover.test of the "Pairwise Multiple Comparison of Mean Ranks" (PMCMR) R package (Pohlert, 2014). This function allows a Bonferroni-type adjustment of p-values, which controls the probability of a type I error across the multiple comparisons.
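The rank-based comparison described above can be illustrated with a minimal sketch in plain Python. It computes the Kruskal-Wallis H statistic (with the usual tie correction) and applies a simple Bonferroni adjustment that multiplies each p-value by the number of comparisons, capped at 1. It does not reproduce the Conover post hoc test or the PMCMR package; the function names, the 2C-values, and the pairwise p-values are all hypothetical.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1.0 - tie_term / (n_total ** 3 - n_total))


def bonferroni(p_values):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical 2C-values (pg) for three groups of populations.
groups = [[4.1, 4.2, 4.3], [4.4, 4.5, 4.6], [4.7, 4.8, 4.9]]
print("H =", round(kruskal_wallis_h(groups), 3))

# Hypothetical unadjusted p-values from three pairwise comparisons.
print("adjusted p-values:", bonferroni([0.01, 0.04, 0.5]))
```

Under the null hypothesis, H is compared against a chi-square distribution with (number of groups − 1) degrees of freedom; that step is omitted here to keep the sketch free of external libraries.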

| Generalized linear models
To assess the factors affecting the 2C-values, we calculated Gaussian generalized linear models (GLMs) following two scenarios: one including all data and one including only mainland data. By the central limit theorem, when independent random variables are added, their properly normalized sum tends to a normal distribution, independently of the original variable distribution.
That is, with a large sample size (i.e., more than 100 observations in this case), the mean values tend to a normal distribution (Kwak & Kim, 2017). Therefore, as suggested in a previous work by our team, we considered the application of the GLMs appropriate.
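The sample-size argument above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it draws values from a strongly right-skewed exponential distribution (chosen for illustration, not taken from the study), uses 114 observations per sample to mirror the number of populations, and compares the skewness of the raw draws with that of the sample means. All names and settings are hypothetical.

```python
import random
import statistics


def skewness(values):
    """Third standardized moment (population form); 0 for a symmetric sample."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_obs = 114             # observations per simulated sample
n_repeats = 2000        # number of simulated samples (arbitrary choice)

# Raw draws from a right-skewed distribution.
raw_draws = [rng.expovariate(1.0) for _ in range(n_repeats)]

# Means of repeated samples of size n_obs from the same distribution.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n_obs)])
    for _ in range(n_repeats)
]

print("skewness of raw draws:   ", round(skewness(raw_draws), 2))
print("skewness of sample means:", round(skewness(sample_means), 2))
```

The sample means should be markedly less skewed than the raw draws, which is the behavior the central limit theorem predicts for large samples.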

| Bioclimatic factors
Nineteen climatic variables were used in the present study (for data see Appendix S1-  et al., 2017, 2020).
To extract the principal components of the 19 bioclimatic variables, we used the "vegan" package for R and followed the Kaiser-Guttman and broken stick model criteria to determine the number of components to retain, that is, those with eigenvalues above the mean eigenvalue and the broken stick model (see Borcard et al., 2011). We then interpreted the bioclimatic meaning of the retained components, based on their correlations (i.e., loadings) with the initial bioclimatic variables, and determined the amount of variation explained by each retained component, based on their respective eigenvalues.
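The two retention criteria can be sketched in plain Python once the eigenvalues of a principal component analysis are available. The eigenvalues below are hypothetical (five components rather than the 19 bioclimatic variables), the function names are illustrative, and the sketch does not reproduce the "vegan" workflow itself. It assumes the eigenvalues are sorted in decreasing order.

```python
def kaiser_guttman(eigenvalues):
    """Indices of components whose eigenvalue exceeds the mean eigenvalue."""
    mean_eig = sum(eigenvalues) / len(eigenvalues)
    return [i for i, ev in enumerate(eigenvalues) if ev > mean_eig]


def broken_stick_expected(p):
    """Expected variance proportions under the broken stick model.

    For component k (1-based) out of p: (1/p) * sum_{i=k}^{p} 1/i.
    """
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]


def broken_stick(eigenvalues):
    """Indices of leading components whose variance share exceeds expectation."""
    total = sum(eigenvalues)
    expected = broken_stick_expected(len(eigenvalues))
    retained = []
    for i, ev in enumerate(eigenvalues):
        if ev / total <= expected[i]:
            break  # stop at the first component that falls short
        retained.append(i)
    return retained


# Hypothetical eigenvalues, sorted in decreasing order.
eigenvalues = [2.5, 1.5, 0.6, 0.3, 0.1]
print("Kaiser-Guttman retains:", kaiser_guttman(eigenvalues))
print("Broken stick retains:  ", broken_stick(eigenvalues))
print("Variance explained:    ", [round(ev / sum(eigenvalues), 3) for ev in eigenvalues])
```

The last line shows how the proportion of variation explained by each component follows directly from its eigenvalue divided by the sum of all eigenvalues.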

| Genome size variation in islands and mainland populations
The cytogenomic results for the 114 populations were summarized in Appendix S1-  Figure 3).
Regarding the islands, the Azorean populations presented the smallest mean 2C-values (4.217 ± 0.093 pg), followed by Madeira (4.348 ± 0.107 pg). In the mainland, a geographic gradient can be observed, with the northern populations presenting larger genomes than the southern ones (see Figure 2). Overall, Fajã das Achadas da Cruz on Madeira Island presented the smallest genome (4.074 ± 0.079 pg), and Praia dos Barcos, Porto, in the mainland, the largest one (5.047 ± 0.145 pg; for data see Appendix S1-Table S3). Regarding the biogeographic regions by Costa et al. (1998) (Figure 4).

| Climatic variables and genome size
When considering the whole dataset, the best GLM corresponded to the combination of the biogeographic regions with the bioclimatic variables (Table 1), separating the Azores and Madeira from the biogeographic regions in the mainland, which agrees with the nonparametric analysis shown in Figures 3 and 4. Although the differences between islands and the mainland were significant (Figure 3), the model based on this contrast was less informative than the model discriminating biogeographic regions in both islands and mainland (Figure 4). The latter model not only clearly separated the Azores and Madeira from the mainland populations but also incorporated the considerable variation found among the latter. Regarding the bioclimatic variables, we retained three main components characterizing the climate at all included populations (Table 2; for data see Appendix S1-Figure S1). When considering mainland data only, the fine-scale information provided by the climate (i.e., the two main climatic components extracted from the bioclimatic data; see Table 2; for data see Appendix S1-Figure S1) provided the best GLM (Table 1).
PC1 was associated with high temperatures, low precipitation, and high precipitation seasonality. PC2 was associated with low temperature variation, low temperature in the driest/warmest part of the year, and high temperature in the coldest/wettest part of the year.
The two retained components explained 60.7% and 32.7% of the variation in the data, respectively. Latitude and longitude provided significant but relatively low fit models, while altitude provided a non-significant model (Table 1). Our results seem to demonstrate a gradient from south to north (with GS increasing in that direction, Spearman correlation, r = 0.46, p < .001) and from east to west (with GS decreasing in that direction, Spearman correlation, r = 0.42, p < .001), the latter being the result of smaller genomes in island populations. The correlation of GS with the second main climatic component was negative (Spearman correlation, r = −0.28, p = .01481), meaning that larger values of GS would be found at places with high temperature variation, high temperature in the driest/warmest part of the year, and low temperature in the coldest/wettest part of the year (see Table 2). The correlation of GS with the first main climatic
closely related taxa (Castro et al., 2013; Roxo et al., 2021; Zahradníček et al., 2018). There are few studies which investigate the intraspecific
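The Spearman rank correlations reported in this section between GS and latitude, longitude, or the climatic components can be sketched as a Pearson correlation computed on average ranks. The latitudes and 2C-values below are hypothetical and serve only to show the calculation; the function names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical population latitudes (degrees N) and mean 2C-values (pg).
latitude = [37.0, 38.5, 39.4, 40.6, 41.7]
genome_size = [4.40, 4.35, 4.60, 4.55, 5.00]
print("Spearman correlation:", round(spearman(latitude, genome_size), 2))
```

A positive coefficient here would correspond to GS increasing from south to north; the associated p-value is not computed in this sketch.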

| Genome size variation and colonization of insular environments
Our results revealed that the island populations presented smaller genome sizes and less variation when compared to mainland ones.
Moreover, the data collected from 114 populations across the Portuguese territory revealed that biogeography, separating the Azores, Madeira, and the biogeographic regions in the mainland, plays a critical role in shaping genome size. When considering the entire dataset, geographic isolation and the distinction between insular and continental habitats appear to be the most important factors in shaping genome size. The tendency toward smaller genome size in endemic species was observed by Suda et al. (2003, 2005) in the Canary Islands, and by Kapralov and Filatov (2011) in the Hawaiian and Marquesas archipelagos. Therefore, it seems that small genomes are advantageous when colonizing new habitats. Kapralov and Filatov (2011) argue that smaller island genome sizes may be due to (i) genome downsizing during or after colonization, or (ii) a predominance of colonizers with small genomes. The mechanisms underlying angiosperm genome size variation have only recently become better understood, with several correlations between genome size and ecological and evolutionary factors being investigated (Roddy et al., 2020).
At the molecular level, Suda et al. (2005) argue that the smaller genomes of island samples might be more advantageous by reducing genetic instability. Furthermore, plants may not be able to colonize habitats with low phosphate and nitrogen levels if they have large genomes (Guignard et al., 2016; Šmarda et al., 2013). At the cellular level, large genomes imply larger nuclei, which in turn produce larger cells, whereas small genomes are more flexible in terms of cell size, and as cell size decreases, the ratio of cell surface area to cell volume rises exponentially (Roddy et al., 2020). Therefore, genome size indirectly restricts the maximum rate of stomatal opening and closing by having an impact on the sizes and densities of stomata (Drake et al., 2013; McAusland et al., 2016; Roddy et al., 2020), which in turn influences the maximum rates of leaf surface conductance to CO2 and water, and ultimately photosynthetic metabolism per unit leaf surface area (Simonin & Roddy, 2018). Moreover, research on invasive species indicates that invasive genotypes have smaller genomes and faster rates of stem elongation than their native genotypes (Lavergne et al., 2010), and studies with maize have established a negative association between genome size and the rate of cell production (Bilinski et al., 2018). Therefore, it seems that small genomes allow for greater variation in cell size and metabolism, facilitating the structure's adaptation to environmental changes (Knight et al., 2005; Roddy et al., 2020). Moreover, GS also seems to be correlated with breeding systems (Bennett, 1972), and Crithmum maritimum appears to be a selfing species because its genetic structure is similar to that of most selfing species (Latron et al., 2018).
TABLE 2 Correlation of PC axes with bioclimatic variables extracted, including all data and only mainland data, respectively.
Selfing is a breeding system that assures the reproduction and establishment of a sexual population from even a single colonizer (Crawford et al., 2015), which is advantageous when colonizing volcanic islands. Moreover, selfing species seem to have consistently smaller genome sizes through the reduction in transposable element numbers under the deleterious recessive model of transposable element numbers (Albach & Greilhuber, 2004). This model states that transposable element numbers are reduced in selfers due to the greater homozygosity of selfing species, which increases the strength of selection against deleterious insertions that cannot be hidden by recessivity (Albach & Greilhuber, 2004; Morgan, 2001). Therefore, a geographic element related to island isolation and a model of IBD (isolation by distance) seem to be important in shaping genome size; however, the maintenance of such a characteristic through time may be related to a lack of gene flow (Franks, 2010).
Although further studies are needed to improve our understanding of the mechanisms underlying genome size evolution, our findings suggest that smaller genome sizes are correlated with insular environments.

| Climatic variables and genome size variation
Here, we present a comparative study of genome size among the mainland populations incorporating environmental variables. The relationships between genome size and both bioclimatic and geographic variables have been widely studied among various groups of plants (Bottini et al., 2000; Brilhante et al., 2021; Díez et al., 2013; Suda et al., 2003, 2005). So far, however, these studies have not reached a consistent general pattern. Larger genome sizes were found not only in populations located at sites with high temperature variation, high temperature in the driest/warmest part of the year, and low temperature in the coldest/wettest part of the year but also in places with low temperatures, high precipitation, and low precipitation seasonality. In other words, genome size tended to increase from south to north. Even in other life forms, such as prokaryotes, greater environmental variability has been shown to result in genomes with a larger number of genes (Bentkowski et al., 2015). However, caution is needed when extrapolating these observations to other taxa, since negative (e.g., Bottini et al., 2000; Díez et al., 2013) and positive (e.g., Basak et al., 2019; Chrtek Jr. et al., 2009) correlations have been observed for longitude and latitude.
The determination of nuclear DNA content by flow cytometry can be affected by compounds present in plants (Noirot et al., 2000; Price et al., 2000). C. maritimum is rich in several compounds, such as tannins and flavonoids (Atia et al., 2011), which can interfere with the staining and light scatter properties of the propidium iodide fluorochrome (Loureiro et al., 2006; Peluso et al., 2014).
Therefore, the variation across the mainland could be explained by the different chemotypes that have already been observed along the Portuguese coast (Pateira et al., 1999). Such artefactual variation, induced by the environment, has been observed in B. bituminosa (Walker et al., 2006); nevertheless, that study also observed true intraspecific variation related to geographic isolation (insular vs. continental populations).
Latron et al. (2018, 2020), who observed a lack of spatial trends in the genetic diversity of C. maritimum. These genomic variations may be the result of drift or of selection. However, in the case of continental populations, the latter appears to be the main evolutionary force in action, while in the case of the more distinct insular populations, drift, and more specifically the founder effect, may have played an important role in the mechanisms linked to genome size alteration (Blommaert, 2020).
In conclusion, further studies are needed to improve our understanding of the mechanisms underlying genome size evolution.

AUTHOR CONTRIBUTIONS
funding acquisition (equal); investigation (equal); project administration (equal); resources (equal); supervision (equal); validation (equal); writing - review and editing (equal).

ACKNOWLEDGMENTS
The authors would like to acknowledge the support provided by

CONFLICT OF INTEREST STATEMENT
The authors declare that they have no conflict of interest.

DATA AVAILABILITY STATEMENT
The genome size (GS) dataset for the 114 populations of Crithmum maritimum will be deposited in the Zenodo repository once the paper is accepted; however, a DOI has already been generated: https://doi.